perm filename NOTE[900,BGB] blob
sn#129600 filedate 1974-11-11 generic text, type T, neo UTF8
00100 1 August 1969
00200
00300 CART VISION PROJECT PROPOSAL
00400
00500 Dear John,
00600 At last week's cart-meeting you seemed to feel that there
00700 was a lack-of-visible-progress with respect to the cart project.
00800 You seemed to want promises to be made in order that progress can be
00900 measured. Further on, you attempted to derive a strategy for the project
01000 to follow from gedanken experiments - strategies such as "simulation";
01100 thought-experiments such as "car-passing-on-hill". And finally, as an
01200 off-hand remark, you asked: if there were twenty people (whom you
01300 would have hired had you known what to tell them to do), what would they
01400 be doing?
01500
01600
01700 PROMISES
01800
01900 I promise:
02000 I. To use realistic road data. Using film, the cart, and a camera on a cable,
02100 I will collect road data thru color filters.
02200 II. To write an image enhancer program - to increase contrast
02300 using JPL-type operators, to focus by stealing Sobel & Tenenbaum's
02400 routine, and to average over a number of frames.
02500 III. To write an image reduction program - reduces an image to a set
02600 of possibly overlapping blobs each circumscribed by a polygon
02700 and characterized by various properties such as area, average
02800 intensity, moment, convexity, perimeter-length, homogeneity,
02900 and the like - by extending my FLIP program somewhat.
03000 IV. To write an image extrapolator program - merely extends polygon-blobs
03100 beyond edges of image & guesses crude depths from focus setting.
03200 V. To write a 2D to 3D converter - takes the extrapolated image-polygon & depth
03300 guess thru an inverse perspective transform with respect to lens-angle,
03400 pan, tilt and camera location.
03500 VI. To write a camera controller or use Rod's cart controller,
03600 or merely to move the camera (say down the road) or change the slide;
03700 and to issue commands or advice to the cart controller with respect
03800 to goals or detection of obstacles.
03900 VII. To write an image predictor that generates a 2D perspective
04000 image of the road with intensities derived for a single
04100 light source such as the sun or a headlight & the polygon-blob
04200 properties. I have already written and have working a hidden line
04300 eliminator.
04400 VIII. To write a verifier-corrector - the predicted image is
04500 checked against the previous and extrapolations are altered
04600 accordingly.
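
Several of the blob properties named in promise III (area, perimeter-length, convexity) follow directly from the circumscribing polygon. The following is a minimal sketch, not the FLIP program: it assumes a blob is given as a simple (non-self-intersecting) polygon with vertices listed in order, and the function name `polygon_properties` is hypothetical. Area and centroid use the standard shoelace formula.

```python
import math

def polygon_properties(vertices):
    """Return area, perimeter, centroid and convexity of a simple polygon.

    `vertices` is a list of (x, y) pairs in order around the boundary.
    (Assumed representation; the memo does not specify one.)
    """
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")

    twice_area = 0.0      # signed; positive for counter-clockwise order
    cx = cy = 0.0
    perimeter = 0.0
    turn_signs = set()

    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        x2, y2 = vertices[(i + 2) % n]

        cross = x0 * y1 - x1 * y0            # shoelace term for this edge
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
        perimeter += math.hypot(x1 - x0, y1 - y0)

        # Sign of the turn at vertex i+1; a convex polygon never mixes signs.
        turn = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
        if turn != 0:
            turn_signs.add(turn > 0)

    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")

    return {
        "area": abs(twice_area) / 2.0,
        "perimeter": perimeter,
        "centroid": (cx / (3.0 * twice_area), cy / (3.0 * twice_area)),
        "convex": len(turn_signs) <= 1,
    }
```

Properties that depend on pixel values, such as average intensity and homogeneity, would need the image itself and are left out of this sketch.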
04700
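The inverse perspective transform of promise V can be illustrated for the simplest case, a flat road. This is a sketch under stated assumptions, not the converter itself: the ground is the plane z = 0, the camera is an ideal pinhole with no roll, pan is measured counter-clockwise from the world x axis, tilt is the downward pitch, and the names `focal_length_pixels` and `image_to_ground` are hypothetical. A depth guess from the focus setting would replace the flat-ground assumption for blobs that are not on the road surface.

```python
import math

def focal_length_pixels(image_width, lens_angle):
    """Focal length in pixel units from the horizontal lens angle (radians)."""
    return (image_width / 2.0) / math.tan(lens_angle / 2.0)

def image_to_ground(u, v, f, pan, tilt, camera):
    """Map image point (u, v) to the ground plane z = 0.

    u is pixels right of the image center, v is pixels below it.
    pan and tilt are in radians; camera is (x, y, height).
    Returns (x, y) on the ground, or None if the ray never reaches it
    (the point lies at or above the horizon).
    """
    cam_x, cam_y, height = camera
    st, ct = math.sin(tilt), math.cos(tilt)
    sp, cp = math.sin(pan), math.cos(pan)

    # Camera axes expressed in world coordinates (x, y horizontal, z up).
    forward = (ct * cp, ct * sp, -st)
    right = (sp, -cp, 0.0)
    down = (-st * cp, -st * sp, -ct)

    # Direction of the ray through pixel (u, v).
    d = tuple(f * fw + u * r + v * dn
              for fw, r, dn in zip(forward, right, down))
    if d[2] >= 0:
        return None

    t = height / -d[2]    # distance along the ray to reach z = 0
    return (cam_x + t * d[0], cam_y + t * d[1])
```

Applying `image_to_ground` to each vertex of an extrapolated image-polygon gives the corresponding polygon on the road plane.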
04800 STRATEGY
04900
05000 I believe that attacking problems such as style-of-driving
05100 or what happens if we are passing a truck on a hill and ... is premature
05200 when the robot-driver is to date still blind. Concentrating on vision
05300 with respect to roads, there are two
05400 problems: the Image-Correlation-Problem and the World-Modeling-Problem.
05500 The image-correlation-problem comes in several flavors: tracking the motion of moving
05600 objects (correlating the object in several images), correlating the views from a moving vehicle,
05700 windowing (that is, patching together images collected from a pan of the camera),
05800 kinetic depth perception, and parallax. The world-modeling-problem involves
05900 a strategy of handling most everything in a similar manner, as opposed
06000 to Feldman's world-model of handling a few well-known objects.
06100 I believe that the image-predictor-corrector is a realistic way to handle
06200 correlation-type problems, and that the polygon-blob-property world
06300 model is a kludge that is used for want of a better ultimate high-level general
06400 representation. I really consider
06500 blob-polygons as an intermediate stage between visual data
06600 and semantic data, and would like to see a mapping or recognition step
06650 which converts polyblobs into nouns and adjectives.
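
The predictor-corrector idea can be shown on a toy case. This sketch is an illustration under heavy assumptions, not the proposed system: a single blob lies on flat ground straight ahead of a level camera, the only model parameter is its depth, the cart advances a known step between frames, and "verifying" is read as comparing the predicted image row with the observed one. The names `predict_row` and `verify_and_correct` and the `gain` parameter are hypothetical.

```python
def predict_row(depth, f, height):
    """Image row (pixels below center) of a ground point `depth` ahead,
    for a level camera at `height` with focal length `f` in pixels."""
    return f * height / depth

def verify_and_correct(depth_guess, observed_rows, step, f, height, gain=0.5):
    """Run a predict / verify / correct loop over a sequence of frames.

    Between frames the cart advances `step` toward the blob, so the model
    extrapolates depth - step. The predicted row is checked against the
    observed row, and the depth is nudged toward the value the observation
    implies. Assumes depths and observed rows stay positive.
    Returns a list of (predicted_row, observed_row, corrected_depth).
    """
    history = []
    depth = depth_guess
    for i, observed in enumerate(observed_rows):
        if i > 0:
            depth -= step                           # extrapolate the model
        predicted = predict_row(depth, f, height)   # predict the image
        implied = f * height / observed             # depth explaining the observation
        depth += gain * (implied - depth)           # correct the model
        history.append((predicted, observed, depth))
    return history
```

With a gain of 0.5 the depth error halves on every frame, so a poor first guess is corrected within a few images. A gain below 1 keeps one noisy frame from overwriting the model.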
06700
06800
06900 IF I HAD TWENTY PEOPLE or the money equivalent:
07000
07100 I might be able to define & coordinate specific tasks with respect to:
07300 Image-enhancement
07400 Computer-Graphics (for image predicting and for debugging)
07500 Illumination Calculation
07600 World-Models (hand written world models, and
07700 world model generators)
07800 With respect to hardware shopping:
07900 Radio Controlled Car
08000 Color Vision
08100 Video Image Output - for seeing results of digital averaging,
08200 digital contrast, and illumination predictions.
08300 Hardcopy Graphical device faster than the plotter, with more resolution
08400 than the line printer.
08500 Ambulatory Robot
08600 Robot with an Eye-stalk (camera that can translate as well as pan/tilt).
08700